Abstract

In a sensor network, in practice, the communication among sensors is subject to: (1) errors or failures 
at random times; (2) costs; and (3) constraints since sensors and networks operate under scarce resources, 
such as power, data rate, or communication. The signal-to-noise ratio (SNR) is usually a main factor 
in determining the probability of error (or of communication failure) in a link. These probabilities are 
then a proxy for the SNR under which the links operate. The paper studies the problem of designing the 
topology, i.e., assigning the probabilities of reliable communication among sensors (or of link failures) 
to maximize the rate of convergence of average consensus, when the link communication costs are taken 
into account, and there is an overall communication budget constraint. To consider this problem, we 
address a number of preliminary issues: (1) model the network as a random topology; (2) establish 
necessary and sufficient conditions for mean square sense (mss) and almost sure (a.s.) convergence of 
average consensus when network links fail; and, in particular, (3) show that a necessary and sufficient 
condition for both mss and a.s. convergence is for the algebraic connectivity of the mean graph describing 
the network topology to be strictly positive. With these results, we formulate topology design, subject 
to random link failures and to a communication cost constraint, as a constrained convex optimization 
problem to which we apply semidefinite programming techniques. We show by an extensive numerical 
study that the optimal design improves significantly the convergence speed of the consensus algorithm 
and can achieve the asymptotic performance of a non-random network at a fraction of the communication 
cost. 
I. Introduction

We consider the design of the optimal topology, i.e., the communication configuration of a sensor 
network that maximizes the convergence rate of average consensus. Average consensus is a distributed 
algorithm that has been considered by Tsitsiklis in his PhD thesis, [1], see also [2], found application 
recently in several areas, and is the subject of active research, e.g., [3], [4], [5], [6].

This topology design for sensor networks has not received much attention in the literature. References
[7] and [8] restrict it to classes of random graphs, in particular, small-world topologies.
The more general question of designing the topology that maximizes the convergence rate, under a 
constraint on the number of network links, was considered in our previous work, [9], [10], [11], where 
we reduced to average consensus the problem of distributed inference in sensor networks; see also [12]. 

Realistic networks operate under stress: (1) noise and errors cause links to fail at random times; 
(2) communication among sensors entails a cost; and (3) scarcity of resources constrains sensor and
network operation. We model such a non-deterministic network topology as a random field. Specifically,
we assume the following: 1) at each iteration of the consensus algorithm, a network link is active 
with some probability, referred to as link formation or utilization probability; 2) network links have 
different link formation probabilities; 3) links fail or are alive independently of each other; and 4) the 
link formation probabilities remain constant across iterations. Designing the network topology corresponds 
then to (1) fixing the probability, or fraction of time, each link is used, (2) knowing that communication 
among sensors may be cheap (e.g., sensors are geographically close), or expensive, and (3) recognizing 
that there is an overall budget constraint taxing the communication in the network. 

The paper extends our preliminary convergence results, [13], on networks with random links. The recent 
paper [14] adopts a similar model and analyzes convergence properties using ergodicity of stochastic 
matrices. Consensus with a randomized network also relates to gossip algorithms, [15], where only a single 
pair of randomly selected sensors is allowed to communicate at each iteration, and the communication 
exchanged by the nodes is averaged. In our randomized consensus, we use multiple randomly selected 
links at each iteration and, in contradistinction with [15], we design the optimal topology, i.e., the 
optimal weight (not simple average) and the optimal probabilities of edge utilization, recognizing that 
communication entails costs, and that there is a communication cost constraint. Other recent work on 
evolving topologies includes [16] that considers continuous time consensus in networks with switching 
topologies and communication delays, and [17] that studies distributed consensus when the network is a 
complete graph with identical link failure probabilities on all links. 

We outline the paper. Section II summarizes spectral graph theory concepts like the graph Laplacian $\mathcal{L}$
and the graph algebraic connectivity $\lambda_2(\mathcal{L})$. The Section also formulates the problem of distributed average
consensus with random link failures. Sections III and IV derive necessary and sufficient conditions for
convergence of the mean state, mss convergence, and a.s. convergence in terms of the average $E[\lambda_2(L)]$ and
in terms of $\lambda_2(\bar{L})$, where $\bar{L} = E[L]$. Section V presents bounds on the mss convergence rate. Section VI
addresses the topology design for random networks with communication cost constraints. We formulate a 
first version of the problem, the randomized distributed consensus with a communication cost constraint 
(RCCC), and then an alternate version, which we show is a convex constrained optimization problem, to 
which we apply semidefinite programming (SDP) techniques. Section VII studies the performance of the
topologies found by solving numerically the SDP optimization. We show that these designs can improve 
significantly the convergence rate, for example, by a factor of 3, when compared to geometric networks 
(networks where sensors communicate with every other sensor within a fixed radius) and that they can 
achieve practically the (asymptotic) performance of a nonrandom network at a fraction, e.g., 50 %, of 
the communication cost per iteration. Section VIII concludes the paper. 

II. Distributed Average Consensus 

Subsection II-A presents two network models: Model 1) Nonrandom topology in paragraph II-A.1; and
Model 2) Random topology in paragraph II-A.2. Subsection II-B considers distributed average consensus
with nonrandom topologies in paragraph II-B.1 and random topologies in paragraph II-B.2. We assume
synchronous communication throughout.

A. Nonrandom and Random Topologies 

In a nonrandom topology, the communication channels stay available whenever the sensors need to 
communicate. This model is described in paragraph II-A.1, where we recall basic concepts from graph
theory. In many sensor network applications, it makes sense to consider that links among sensors may 
fail or become alive at random times. This models, for example, applications when the network uses an 
ARQ protocol and no acknowledgement packet is received within the protocol time window, in which 
case the transmitted packet is assumed to be dropped or lost. This is also the case, when the transmission 
is detected in error. The random topology introduced in paragraph II-A.2 models these networks. 

1) Nonrandom topology: The nonrandom topology is defined by an undirected graph $G = (V, \mathcal{E})$,
where $V$ is the set of vertices that model the sensors and $\mathcal{E}$ is the set of edges that model the
communication channels. We refer to $G$ as the supergraph, $\mathcal{E}$ as the superset of edges, and edges in $\mathcal{E}$ as realizable
edges or links. This terminology becomes better motivated when we consider the random topology in
Subsection II-A.2. The cardinalities of the sets $|V| = N$ and $|\mathcal{E}| = M$ give the number of network sensors
and the number of channels or links, respectively. For the complete graph $G = (V, \mathcal{M})$, $\mathcal{M}$ is the set of
all possible $N(N-1)/2$ edges. In practice, we are interested in sparse graphs, i.e., $M \ll N(N-1)/2$.
We label a node or vertex by an integer $n$, where $n \in \{1, \dots, N\}$. Sensors $n$ and $l$ communicate if there is
an edge $(n, l) \in \mathcal{E}$. Since the graph is undirected, if $n$ communicates with $l$, then $l$ communicates with $n$.
The graph is called simple if it is devoid of loops (self-edges) and multiple edges. It is connected if every
vertex can be reached from any other vertex, which in network terms may require a routing protocol.
The number $d_n$ of edges connected to vertex $n$ is called the degree of the vertex. A graph is regular if
every vertex has the same degree $d$. Unless otherwise stated, we consider only simple, connected graphs.
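Associated with the graph $G$ is its $N \times N$ adjacency matrix $\mathcal{A}$

$$\mathcal{A}_{nl} = \begin{cases} 1 & \text{if } (n,l) \in \mathcal{E} \\ 0 & \text{otherwise} \end{cases} \tag{1}$$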



The neighborhood structure of the graph is defined by

$$\forall\, 1 \le n \le N: \quad \Omega_n = \{ l \in V : (n,l) \in \mathcal{E} \} \tag{2}$$

The degree of node $n$ is also the cardinality of its neighborhood set

$$\forall\, 1 \le n \le N: \quad d_n = |\Omega_n| \tag{3}$$

Let $\mathcal{D} = \mathrm{diag}(d_1, \dots, d_N)$ be the degree matrix. The graph Laplacian matrix $\mathcal{L}$ is defined as

$$\mathcal{L} = \mathcal{D} - \mathcal{A} \tag{4}$$

The Laplacian $\mathcal{L}$ is a symmetric positive-semidefinite matrix; hence, all its eigenvalues are non-negative.
We order the Laplacian eigenvalues as

$$0 = \lambda_1(\mathcal{L}) \le \lambda_2(\mathcal{L}) \le \cdots \le \lambda_N(\mathcal{L}) \tag{5}$$

The multiplicity of the zero eigenvalue of the Laplacian is equal to the number of connected components
of the graph. Thus, for a connected graph, $\lambda_2(\mathcal{L}) > 0$. In the literature, $\lambda_2(\mathcal{L})$ is referred to as the
algebraic connectivity (or Fiedler value) of the network (see [18]). The normalized eigenvector $u_1(\mathcal{L})$
corresponding to the zero eigenvalue is the normalized vector of ones

$$u_1(\mathcal{L}) = \frac{1}{\sqrt{N}} \mathbf{1} = \frac{1}{\sqrt{N}} \left[ 1 \ \cdots \ 1 \right]^T \tag{6}$$
For additional concepts from graph theory see [19], [20], [21].
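The definitions (1) through (5) can be illustrated with a minimal sketch. The code below is our own illustration, not part of the original development: the helper names `laplacian` and `count_components` are hypothetical, nodes are indexed from 0 rather than 1, and the three-node path graph is an assumed example. It builds $\mathcal{L} = \mathcal{D} - \mathcal{A}$ from an edge list and counts connected components, which by the remark after (5) equals the multiplicity of the zero eigenvalue.

```python
from collections import deque

def laplacian(num_nodes, edges):
    """Return the graph Laplacian D - A of a simple undirected graph as a list of rows."""
    lap = [[0] * num_nodes for _ in range(num_nodes)]
    for n, l in edges:
        lap[n][l] -= 1   # off-diagonal entries: minus the adjacency matrix
        lap[l][n] -= 1
        lap[n][n] += 1   # diagonal entries: vertex degrees
        lap[l][l] += 1
    return lap

def count_components(num_nodes, edges):
    """Count connected components by breadth-first search."""
    neighbors = {n: set() for n in range(num_nodes)}
    for n, l in edges:
        neighbors[n].add(l)
        neighbors[l].add(n)
    seen, components = set(), 0
    for start in range(num_nodes):
        if start in seen:
            continue
        components += 1
        seen.add(start)
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in neighbors[node] - seen:
                seen.add(nxt)
                queue.append(nxt)
    return components

# Assumed example: the path graph 0 - 1 - 2.
path_edges = [(0, 1), (1, 2)]
L_path = laplacian(3, path_edges)
```

Each row of the Laplacian sums to zero, which is why the vector of ones in (6) is an eigenvector for the zero eigenvalue.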

2) Random Topology: We consider sensor networks where failures may occur at random due to noise,
as when packets are dropped. If a link fails at time $i$, it can come back online at a later time (a failed
transmission may be succeeded by a successful one). We describe a graph model for this random topology.
We start with the model in paragraph II-A.1 of a simple, connected supergraph $G = (V, \mathcal{E})$ with $|V| = N$
and $|\mathcal{E}| = M$. The superset of edges $\mathcal{E}$ collects the realizable edges, i.e., the channels that are established
directly among sensors in the network when all realizable links are online. These channels may fail at
random times, but if $(n, l) \notin \mathcal{E}$ then sensors $n$ and $l$ do not communicate directly; of course, they still
communicate by rerouting their messages through one of the paths connecting them in $G$, since $G$ is
connected. We now construct the model for the random topology problem, see also [13], [14], [15].

To model this network with random link failures, we assume that the state, failed or online, of each link
$(n, l) \in \mathcal{E}$ over time $i = 1, 2, \dots$ is a Bernoulli process with probability of formation $P_{nl}$, i.e., the probability
of failure at time $i$ is $1 - P_{nl}$. We assume that for any realizable edges $(n, l) \ne (m, k)$ the corresponding
Bernoulli processes are statistically independent. Under this model, at each time $i$, the resulting
topology is described by a graph $G(i) = (V, E(i))$. The edge set $E(i)$ and the adjacency matrix $A(i)$ are
random, with $E(i)$ and $E(j)$, as well as $A(i)$ and $A(j)$, statistically independent, identically distributed
(iid) for $i \ne j$. Note that $E(i) \subseteq \mathcal{E}$ and $0 \preceq A(i) \preceq \mathcal{A}$, where $0$ is the $N \times N$ zero matrix and $C \preceq D$
stands for $\forall\, 1 \le i, j \le N : C_{ij} \le D_{ij}$. We can think of the set $E(i)$ as an instantiation of a random
binary valued $M$-tuple. Since the links are independent, the probability of a particular instantiation $E(i)$ is
$\prod_{(n,l) \in E(i)} P_{nl} \prod_{(n,l) \in \mathcal{E} \setminus E(i)} (1 - P_{nl})$. We collect the
edge formation probabilities in the edge formation probability matrix

$$P = P^T = [P_{nl}], \qquad P_{nn} = 0$$

The diagonal elements are zero because the graph is simple (no loops). The structure of $P$ reflects the
structure of the adjacency matrix $\mathcal{A}$ of the superset $\mathcal{E}$, i.e., $P_{nl} \ne 0$ if and only if $\mathcal{A}_{nl} = 1$. The matrix $P$
is not stochastic; its elements satisfy $0 \le P_{nl} \le 1$ but their row or column sums are not normalized to 1.
Abusing notation, we will refer to $P$ as the probability distribution of the $E(i)$ and $A(i)$.
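A minimal sketch of this link model follows. It is an illustration under our own assumptions: the function name `sample_edge_set`, the 0-based node indices, the example matrix `P`, and the seed are all hypothetical choices, not taken from the paper. Each realizable link is drawn as an independent Bernoulli variable with success probability $P_{nl}$.

```python
import random

def sample_edge_set(P, rng):
    """Draw one realization E(i): link (n, l), n < l, is online with probability P[n][l]."""
    num_nodes = len(P)
    return [(n, l)
            for n in range(num_nodes)
            for l in range(n + 1, num_nodes)
            if P[n][l] > 0 and rng.random() < P[n][l]]

# Assumed example: path supergraph 0 - 1 - 2 with unequal link formation probabilities.
P = [[0.0, 0.9, 0.0],
     [0.9, 0.0, 0.5],
     [0.0, 0.5, 0.0]]
rng = random.Random(0)
realizations = [sample_edge_set(P, rng) for _ in range(10000)]

# Empirical fraction of time the link (1, 2) is online; it should be close to P[1][2].
freq_12 = sum((1, 2) in edges for edges in realizations) / len(realizations)
```

Links with $P_{nl} = 0$ never appear in a realization, so every sampled edge set is a subset of the superset of edges.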

We now consider the average consensus algorithm for both nonrandom and random topologies. 

B. Average Consensus 

We overview average consensus, see [1], [2] and, for recent work, [3]. It computes by a distributed
algorithm the average of $x_n(0)$, $n = 1, \dots, N$, where $x_n(0)$ is available at sensor $n$. At time $i$, each
node exchanges its state $x_n(i)$, $i = 0, 1, \dots$, synchronously with its neighbors specified by the graph
edge neighborhood set, see eqn. (2). In vector form, the $N$ states $x_n(i)$ are collected in the state vector
$x(i) \in \mathbb{R}^{N \times 1}$. Define the average $r$ and the vector of averages $x_{avg}$

$$r = \frac{1}{N} \mathbf{1}^T x(0) \tag{7}$$

$$x_{avg} = r \mathbf{1} \tag{8}$$

$$\phantom{x_{avg}} = \frac{1}{N} \mathbf{1}\mathbf{1}^T x(0) \tag{9}$$

$$\phantom{x_{avg}} = \frac{1}{N} J x(0) \tag{10}$$

where $\mathbf{1}$ is the vector of ones, see (6), and $J = \mathbf{1}\mathbf{1}^T$. We next consider the iterative average consensus
algorithm for both nonrandom and random topologies.

1) Average consensus: Nonrandom topology: With the nonrandom topology defined by the supergraph
$G = (V, \mathcal{E})$, the state update by the average consensus proceeds according to the iterative algorithm

$$\forall\, i \ge 0: \quad x_n(i+1) = W_{nn} x_n(i) + \sum_{l \in \Omega_n} W_{nl} x_l(i) \tag{11}$$

$$x(i+1) = W x(i) \tag{12}$$

where: $\Omega_n$ is the neighborhood of sensor $n$; $x(i)$ is the state vector collecting all states $x_n(i)$, $1 \le n \le N$;
$W_{nl}$ is the weight of edge $(n, l)$; and the matrix of weights is $W = [W_{nl}]$. The sparsity of $W$ is determined
by the underlying network connectivity, i.e., for $n \ne l$, the weight $W_{nl} = 0$ if $(n, l) \notin \mathcal{E}$. Iterating (12),

$$x(i) = \left( \prod_{j=0}^{i-1} W \right) x(0) \tag{13}$$

$$\phantom{x(i)} = W^i x(0) \tag{14}$$

A common choice for the weight matrix $W$ is the equal weights matrix, [22],

$$W = I - \alpha \mathcal{L} \tag{15}$$

where $\mathcal{L}$ is the Laplacian associated with $\mathcal{E}$, and $\alpha \ge 0$ is a constant independent of time $i$. For the equal
weights matrix and a connected network, given the ordering (5) of the eigenvalues of $\mathcal{L}$, and that $\alpha$ is
nonnegative, the eigenvalues of $W$ can be reordered as

$$1 = \lambda_1(W) \ge \lambda_2(W) \ge \cdots \ge \lambda_N(W) \tag{16}$$

The eigenvector corresponding to $\lambda_1(W)$ is still the vector $u_1(W) = \frac{1}{\sqrt{N}} \mathbf{1}$.
Reference [22] studies the problem of optimizing the nonzero weights $W_{nl}$ for maximizing convergence
rate when the adjacency matrix $\mathcal{A}$ is known. In particular, this reference shows that, for the equal weights
case, fastest convergence is obtained with

$$\alpha^* = \frac{2}{\lambda_2(\mathcal{L}) + \lambda_N(\mathcal{L})} \tag{17}$$

In [9], [10], [11], we consider this equal weight W and show that the class of non-bipartite Ramanujan 
graphs provides the optimal (nonrandom) topology under a constraint on the number of network links 
M, see also [12]. This optimality is in the asymptotic limit of large N, see the references for details. 
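The equal-weights iteration (12) with $W = I - \alpha\mathcal{L}$ can be sketched in a few lines. This is an illustration under our own assumptions: the function name `consensus_step`, the 0-based node labels, the three-node path graph, and the initial state are hypothetical. For that path graph the Laplacian eigenvalues are 0, 1, and 3, so (17) gives $\alpha^* = 0.5$.

```python
def consensus_step(x, edges, alpha):
    """One iteration x(i+1) = (I - alpha * L) x(i), accumulated edge by edge."""
    nxt = list(x)
    for n, l in edges:
        diff = x[n] - x[l]
        nxt[n] -= alpha * diff   # node n moves toward its neighbor l
        nxt[l] += alpha * diff   # node l moves toward its neighbor n
    return nxt

# Assumed example: path graph 0 - 1 - 2 with Laplacian eigenvalues 0, 1, 3.
edges = [(0, 1), (1, 2)]
alpha = 2.0 / (1.0 + 3.0)      # eq. (17) with lambda_2 = 1 and lambda_N = 3
x = [3.0, 0.0, 6.0]
average = sum(x) / len(x)      # the value every node should reach
for _ in range(60):
    x = consensus_step(x, edges, alpha)
```

Because the update is antisymmetric across each edge, the sum of the states is preserved at every iteration, so the only possible consensus value is the initial average.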

2) Average consensus: Random topology: At each time $i$, the graph $G(i) = (V, E(i))$ is random. The
distributed average consensus algorithm still follows a vector iterative equation like (12), except now the
weight matrices $W(i)$ are time dependent and random. We focus on the equal weights problem,

$$W(i) = I - \alpha L(i) \tag{18}$$

where $L(i)$ is the Laplacian of the random network at time $i$. The $L(i)$ are random iid matrices whose
probability distribution is determined by the edge formation probability matrix $P$. Likewise, the weight
matrices $W(i)$, $i = 0, 1, \dots$ are also iid random matrices. We often drop the time index $i$ in the random
matrices $L(i)$ and $W(i)$ or their statistics. Iterating (12) with this time dependent weight matrix leads to

$$x(i) = \left( \prod_{j=0}^{i-1} W(j) \right) x(0) \tag{19}$$

Since the weights $W_{nl}$ are random, the state $x(i)$ is also a random vector. Section IV analyzes the
influence of the topology on the convergence properties as we iterate (19).
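The random iteration (19) can be simulated directly. The sketch below is illustrative only: the function name `random_consensus`, the example matrix `P`, the seed, the iteration count, and the step size $\alpha = 0.25$ (an assumed conservative value) are our own choices. At each iteration, every realizable link is independently online with probability $P_{nl}$, and only online links contribute to the update.

```python
import random

def random_consensus(x0, P, alpha, num_iters, rng):
    """Iterate x(i+1) = (I - alpha * L(i)) x(i) with independently failing links."""
    x = list(x0)
    num_nodes = len(x)
    for _ in range(num_iters):
        nxt = list(x)
        for n in range(num_nodes):
            for l in range(n + 1, num_nodes):
                # Link (n, l) is online at this iteration with probability P[n][l].
                if P[n][l] > 0 and rng.random() < P[n][l]:
                    diff = x[n] - x[l]
                    nxt[n] -= alpha * diff
                    nxt[l] += alpha * diff
        x = nxt
    return x

# Assumed example: path supergraph 0 - 1 - 2 with unequal link formation probabilities.
P = [[0.0, 0.9, 0.0],
     [0.9, 0.0, 0.5],
     [0.0, 0.5, 0.0]]
x0 = [3.0, 0.0, 6.0]
x_final = random_consensus(x0, P, alpha=0.25, num_iters=400, rng=random.Random(1))
```

Every realization of $W(i)$ is symmetric with unit row sums, so the state sum is preserved along each sample path even though individual links fail.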

III. Preliminary Results 

Subsection II-B.2 describes the random topology model. The supergraph $G = (V, \mathcal{E})$ is connected
and $P$ is the matrix of edge formation probabilities. Since the $A(i)$, $L(i)$, and $W(i)$ are iid

$$\bar{A} = E[A(i)] \tag{20}$$

$$\bar{L} = E[L(i)] \tag{21}$$

$$\bar{W} = E[W(i)] \tag{22}$$

$$\phantom{\bar{W}} = I - \alpha \bar{L} \tag{23}$$

i.e., their means are time independent. We establish properties of the Laplacian, Subsection III-A, and
weight matrices, Subsection III-B, needed when studying the random topology and random topology with
communication cost constraint problems in Sections IV through VI.

A. Laplacian 

We list some properties of the mean Laplacian and bound the expected value of the algebraic connectivity
of the random Laplacians by the algebraic connectivity of the mean Laplacian.

Lemma 1 The mean adjacency matrix $\bar{A}$ and mean Laplacian $\bar{L}$ are given by

$$\bar{A} = P \tag{24}$$

$$\bar{L}_{nl} = \begin{cases} \sum_{m=1}^{N} P_{nm} & \text{if } n = l \\ -P_{nl} & \text{otherwise} \end{cases} \tag{25}$$

This Lemma is straightforward to prove. From the Lemma, it follows that the mean adjacency matrix $\bar{A}$
is not a (0, 1) matrix. Similarly, from the structure of the matrix $\bar{L}$, see eqn. (25), it follows that $\bar{L}$ can
be interpreted as the weighted Laplacian of a graph $\bar{G}$ with non-negative link weights. In particular, the
weight of the link $(n, l)$ of $\bar{G}$ is $P_{nl}$. The properties of the mean Laplacian are similar to the properties
of the Laplacian. We state them in the following two Lemmas.

Lemma 2 The mean Laplacian matrix $\bar{L} = E[L(j)]$, $j = 0, 1, \dots$ is positive semidefinite. Its eigenvalues
can be arranged as

$$0 = \lambda_1(\bar{L}) \le \lambda_2(\bar{L}) \le \cdots \le \lambda_N(\bar{L}) \tag{26}$$

where the normalized eigenvector associated with the zero eigenvalue $\lambda_1(\bar{L})$ is

$$u_1(\bar{L}) = \frac{1}{\sqrt{N}} \mathbf{1} \tag{27}$$

Proof: Let $z \in \mathbb{R}^{N \times 1}$ be a non-zero vector. Then, from eqn. (25), we have

$$z^T \bar{L} z = \sum_{n,l} \bar{L}_{nl} z_n z_l = \frac{1}{2} \sum_{n \ne l} P_{nl} (z_n - z_l)^2 \tag{28}$$

Since the $P_{nl}$'s are non-negative, $\bar{L}$ is positive semidefinite. Eqn. (27) follows readily from eqn. (28). ■
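Eqns. (25) and (28) are easy to check numerically. The sketch below is our own illustration: the helper names `mean_laplacian` and `quadratic_form`, the example matrix `P`, and the test vector `z` are assumptions, not part of the original text. It builds $\bar{L}$ from $P$ and compares $z^T \bar{L} z$ with the sum of squared differences on the right-hand side of (28).

```python
def mean_laplacian(P):
    """Mean Laplacian of eq. (25): row sums of P on the diagonal, -P[n][l] elsewhere."""
    num_nodes = len(P)
    return [[sum(P[n]) if n == l else -P[n][l] for l in range(num_nodes)]
            for n in range(num_nodes)]

def quadratic_form(M, z):
    """Compute z^T M z for a square matrix M given as a list of rows."""
    return sum(z[n] * M[n][l] * z[l] for n in range(len(z)) for l in range(len(z)))

# Assumed example: path supergraph 0 - 1 - 2 with unequal link formation probabilities.
P = [[0.0, 0.9, 0.0],
     [0.9, 0.0, 0.5],
     [0.0, 0.5, 0.0]]
L_bar = mean_laplacian(P)

z = [1.0, -2.0, 4.0]
lhs = quadratic_form(L_bar, z)
# Right-hand side of (28): one half of the sum over ordered pairs n != l.
rhs = 0.5 * sum(P[n][l] * (z[n] - z[l]) ** 2
                for n in range(3) for l in range(3) if n != l)
```

Since every term on the right-hand side is non-negative, the quadratic form cannot be negative, which is the positive semidefiniteness claimed in Lemma 2.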
Interpreting $\bar{L}$ as the weighted Laplacian of the graph $\bar{G}$, we note that $\lambda_2(\bar{L}) = 0$ implies that $\bar{G}$
is not connected (see [23], [19]). In other words, if $\lambda_2(\bar{L}) = 0$, then $\bar{G}$ has at least two disconnected
components; hence, $\bar{L}$ takes the form of a block diagonal matrix (after permuting the rows and columns).
Such matrices are called reducible matrices. Also, it immediately follows (see [23]) that, if $\bar{L}$ is irreducible,
then $\lambda_2(\bar{L}) \ne 0$. Thus, we get the following Lemma.

Lemma 3 Let the mean Laplacian $\bar{L}$ be the weighted Laplacian for a graph $\bar{G}$. Then

$$\lambda_2(\bar{L}) > 0 \iff \bar{L} \text{ is irreducible} \iff \bar{G} \text{ is connected} \tag{29}$$

The convergence results in Section IV-A on the average consensus involve the mean $E[\lambda_2(L)]$, which
is manifestly difficult to compute. A much easier quantity to compute is $\lambda_2(\bar{L})$. We relate here the two.
First, we show that $\lambda_2(L)$ is a concave function of $L$.

Lemma 4 $\lambda_2(L)$ is a concave function of $L$.

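Proof: From the Courant-Fischer Theorem (see [19], [20])

$$\lambda_2(L) = \min_{z \perp \mathbf{1}} \frac{z^T L z}{z^T z} \tag{30}$$

Then for any two Laplacians $L_1$ and $L_2$ and $0 \le t \le 1$ we have

$$\begin{aligned}
\lambda_2(t L_1 + (1-t) L_2) &= \min_{z \perp \mathbf{1}} \frac{z^T (t L_1 + (1-t) L_2) z}{z^T z} \\
&\ge t \min_{z \perp \mathbf{1}} \frac{z^T L_1 z}{z^T z} + (1-t) \min_{z \perp \mathbf{1}} \frac{z^T L_2 z}{z^T z} \\
&= t \lambda_2(L_1) + (1-t) \lambda_2(L_2)
\end{aligned} \tag{31}$$

Thus $\lambda_2(L)$ is a concave function of $L$. ■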

Lemma 5

$$E[\lambda_2(L)] \le \lambda_2(\bar{L}) \tag{32}$$

Proof: Follows from Lemma 4 and Jensen's inequality. ■
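A small worked example may help to see how loose or tight the bound of Lemma 5 can be. The example is ours and rests on stated assumptions: the supergraph is the three-node path 1 - 2 - 3, and both links form independently with the same probability $p$. A realization is connected only when both links are online; every other realization has $\lambda_2(L) = 0$.

```latex
% Assumed example: path supergraph 1 - 2 - 3, each link online with probability p.
\begin{align*}
L_{\text{path}} &= \begin{bmatrix} 1 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 1 \end{bmatrix},
  \qquad \text{eigenvalues } 0,\ 1,\ 3, \\
\bar{L} &= p \, L_{\text{path}} \quad \Longrightarrow \quad \lambda_2(\bar{L}) = p, \\
E[\lambda_2(L)] &= p^2 \cdot 1 + (1 - p^2) \cdot 0 = p^2 \;\le\; p = \lambda_2(\bar{L}).
\end{align*}
```

In this example the inequality is strict for $0 < p < 1$ and becomes an equality only in the degenerate cases $p = 0$ and $p = 1$.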

B. Weight matrices 

We consider properties of the (random and mean) weight matrices. 

Lemma 6 The eigenvalues of $\bar{W}$ are

$$1 \le j \le N: \quad \lambda_j(\bar{W}) = 1 - \alpha \lambda_j(\bar{L}) \tag{33}$$

$$1 = \lambda_1(\bar{W}) \ge \lambda_2(\bar{W}) \ge \cdots \ge \lambda_N(\bar{W}) \tag{34}$$

The eigenvector corresponding to the eigenvalue $\lambda_1(\bar{W})$ is

$$u_1(\bar{W}) = \frac{1}{\sqrt{N}} \mathbf{1} \tag{35}$$

Similar results hold for $W(i)$.

This Lemma follows immediately from the corresponding results on the mean Laplacian and the $L(i)$.

We now consider results on the spectral norm and its expected value for the random matrices W(i) and 
their mean W. These results are used when studying convergence of the average consensus in Section IV. 



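Lemma 7 Let $z \in \mathbb{R}^{N \times 1}$ and $\rho(\cdot)$ be the spectral radius. Then

$$\left\| W(j) z - \frac{1}{N} J z \right\| \le \rho\left( W(j) - \frac{1}{N} J \right) \left\| z - \frac{1}{N} J z \right\| \tag{36}$$

Proof: Decompose $W(j)$ through orthonormal eigenvectors as $W(j) = U(j) \Lambda(j) U(j)^T$. From
eqn. (34), $\lambda_1(W(j)) = 1$ with normalized eigenvector $u_1(j) = \frac{1}{\sqrt{N}} \mathbf{1}$. Hence,

$$z = \frac{1}{N} J z + \sum_{k=2}^{N} c_k(j) u_k(j) \tag{37}$$

where $c_k(j) = u_k(j)^T z$, $k = 2, \dots, N$. Then

$$W(j) z = \frac{1}{N} J z + \sum_{k=2}^{N} c_k(j) \lambda_k(W(j)) u_k(j) \tag{38}$$

It follows that

$$\begin{aligned}
\left\| W(j) z - \frac{1}{N} J z \right\| &= \left\| \sum_{k=2}^{N} c_k(j) \lambda_k(W(j)) u_k(j) \right\| \\
&\le \rho\left( W(j) - \frac{1}{N} J \right) \left\| \sum_{k=2}^{N} c_k(j) u_k(j) \right\| \\
&= \rho\left( W(j) - \frac{1}{N} J \right) \left\| z - \frac{1}{N} J z \right\|
\end{aligned} \tag{39}$$

This proves the Lemma. ■

Lemma 8 We have

$$\rho\left( \bar{W} - \frac{1}{N} J \right) = \max\left( |\lambda_2(\bar{W})|, |\lambda_N(\bar{W})| \right) = \max\left( \lambda_2(\bar{W}), -\lambda_N(\bar{W}) \right) \tag{40}$$

$$\rho\left( W(i) - \frac{1}{N} J \right) = \max\left( |\lambda_2(W(i))|, |\lambda_N(W(i))| \right) = \max\left( \lambda_2(W(i)), -\lambda_N(W(i)) \right) \tag{41}$$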
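Proof: We prove the Lemma only for $\bar{W}$. The matrix $\frac{1}{N} J$ is rank one, and its non-zero eigenvalue
is 1 with normalized eigenvector $\frac{1}{\sqrt{N}} \mathbf{1}$. Hence, from eqn. (34), the eigenvalues of $\left( \bar{W} - \frac{1}{N} J \right)$ are 0 and
$\lambda_2(\bar{W}), \dots, \lambda_N(\bar{W})$. By the definition of spectral radius and eqn. (34),

$$\rho\left( \bar{W} - \frac{1}{N} J \right) = \max\left( 0, |\lambda_2(\bar{W})|, \dots, |\lambda_N(\bar{W})| \right) = \max\left( |\lambda_2(\bar{W})|, |\lambda_N(\bar{W})| \right) \tag{42}$$

Also, noting that $\lambda_2(\bar{W}) \ge \lambda_N(\bar{W})$, it follows from eqn. (42) that

$$\rho\left( \bar{W} - \frac{1}{N} J \right) = \max\left( \lambda_2(\bar{W}), -\lambda_N(\bar{W}) \right) \tag{43}$$

■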



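We now consider the convexity of the spectral norm as a function of $\alpha$ and $L$.

Lemma 9 For a given $L$, $\rho\left( W - \frac{1}{N} J \right)$ is a convex function of $\alpha$. For a given $\alpha$, $\rho\left( W - \frac{1}{N} J \right)$ is a
convex function of $L$.

Proof: We prove the convexity with respect to $\alpha$ only. Let $\alpha_1, \alpha_2 \in \mathbb{R}$ and $0 \le t \le 1$. For symmetric
matrices the spectral radius is equal to the matrix 2-norm. We get

$$\begin{aligned}
\rho\left( I - (t \alpha_1 + (1-t) \alpha_2) L - \frac{1}{N} J \right)
&= \left\| I - t \alpha_1 L - (1-t) \alpha_2 L - \frac{1}{N} J \right\|_2 \\
&= \left\| t \left( I - \alpha_1 L - \frac{1}{N} J \right) + (1-t) \left( I - \alpha_2 L - \frac{1}{N} J \right) \right\|_2 \\
&\le \left\| t \left( I - \alpha_1 L - \frac{1}{N} J \right) \right\|_2 + \left\| (1-t) \left( I - \alpha_2 L - \frac{1}{N} J \right) \right\|_2 \\
&= t \rho\left( I - \alpha_1 L - \frac{1}{N} J \right) + (1-t) \rho\left( I - \alpha_2 L - \frac{1}{N} J \right)
\end{aligned} \tag{44}$$

that proves the Lemma. ■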
The next Lemma considers the convexity of the expected value of the spectral norm, taken over the
probability distribution of the Laplacian. Lemma 11 then bounds $E\left[ \rho\left( W - \frac{1}{N} J \right) \right]$ from below.

Lemma 10 For a given probability distribution (and hence $P$) of $L$, $E\left[ \rho\left( W - \frac{1}{N} J \right) \right]$ is convex in $\alpha$.

Proof: The convexity of $E\left[ \rho\left( W - \frac{1}{N} J \right) \right]$ follows from Lemma 9, eqn. (44), and the properties of
Lebesgue integration. ■



Lemma 11 For a given choice of $\alpha$,

$$E\left[ \rho\left( W - \frac{1}{N} J \right) \right] \ge \rho\left( \bar{W} - \frac{1}{N} J \right) \tag{45}$$

Proof: The Lemma follows from Lemma 9 and Jensen's inequality. ■

IV. Convergence of Average Consensus: Random Topology 
For average consensus in random topologies, we start by considering the convergence of the state

$$\forall\, x(0) \in \mathbb{R}^{N \times 1}: \quad \lim_{i \to \infty} x(i) = x_{avg} \tag{46}$$

in some appropriate probabilistic sense. Subsection IV-A studies convergence of the mean vector, $E[x(i)]$,
Subsection IV-B considers convergence in the mean-square-sense (mss), and almost sure convergence
(convergence with probability 1) is treated in Subsection IV-C.

A. Mean state convergence 

The sequence of expected state vectors converges if

$$\lim_{i \to \infty} \left\| E x(i) - x_{avg} \right\| = 0 \tag{47}$$

For simplicity, we assume $\| \cdot \|$ to be the $\ell_2$-norm. We analyze the convergence of the mean state vector
in IV-A.1 and then study the topology that optimizes its convergence rate in IV-A.2.
1) Mean state convergence: The mean state evolution is given in the following Lemma. 

Lemma 12 Recall $x_{avg}$ given in (8). Then

$$E x(i) - x_{avg} = \left( \bar{W} - \frac{1}{N} J \right)^i \left( x(0) - x_{avg} \right) \tag{48}$$

Proof: Using eqn. (19) and the fact that the matrices $W(i)$ are iid

$$E[x(i)] = \bar{W}^i x(0) \tag{49}$$

The Lemma follows by recalling that $\mathbf{1}$ is an eigenvector of $\bar{W}$. ■

Convergence of the mean is now straightforward. 

Theorem 13 A necessary and sufficient condition for the mean to converge is

$$\rho\left( \bar{W} - \frac{1}{N} J \right) < 1 \tag{50}$$

Proof: Lemma 12 shows that the convergence of the mean is equivalent to deterministic
distributed average consensus. The necessary and sufficient condition for convergence then follows from
references [11], [24]. ■
2) Fastest mean convergence topology: We introduce the definition of convergence factor. 

Definition 14 (Mean convergence factor) If $\rho\left( \bar{W} - \frac{1}{N} J \right) < 1$, we call $\rho\left( \bar{W} - \frac{1}{N} J \right)$ the mean
convergence factor of the consensus algorithm.

For fastest mean convergence, $\rho\left( \bar{W} - \frac{1}{N} J \right)$ should be as small as possible. Hence, the optimal topology
with respect to convergence of the mean state vector is the topology that minimizes this convergence
factor. We address this problem in the following two Theorems.

We note that $\rho\left( \bar{W} - \frac{1}{N} J \right)$ is a function of both $\alpha$ and $\bar{L}$. In the following Theorem, we state conditions
on $\bar{L}$ that guarantee that we can choose an $\alpha$ for which there is convergence of the mean.

Theorem 15 A necessary condition for the mean to converge is

$$\lambda_2(\bar{L}) > 0 \tag{51}$$

A sufficient condition is (51) and

$$0 < \alpha < 2 / \lambda_N(\bar{L}) \tag{52}$$

Proof: We first prove the necessary condition by contradiction. Let $\lambda_2(\bar{L}) = 0$. From eqn. (33),
it follows that $\lambda_2(\bar{W}) = 1$. Then, from eqn. (40), we have $\rho\left( \bar{W} - \frac{1}{N} J \right) \ge 1$, for every choice of $\alpha$.
Hence, from Theorem 13, it follows that, if $\lambda_2(\bar{L}) = 0$, the mean vector does not converge for any choice
of $\alpha$. This proves the necessary condition.

For sufficiency, we assume that $\lambda_2(\bar{L}) > 0$. Then, generalizing the results in [24] to non-binary
(i.e., not 0-1) matrices, it can be shown that

$$\rho\left( \bar{W} - \frac{1}{N} J \right) < 1 \quad \text{iff} \quad 0 < \alpha < 2 / \lambda_N(\bar{L})$$

which then guarantees convergence of the mean state vector. ■

If $\lambda_2(\bar{L}) > 0$, Theorem 15 and eqn. (52) give the values of $\alpha$ that lead to the convergence of the
mean vector in terms of $\lambda_N(\bar{L})$, a quantity easily evaluated since $\bar{L}$ is given by eqn. (25).
The following Theorem gives the choice of $\alpha$ leading to the fastest convergence of the mean.
Theorem 16 Let $\lambda_2(\bar{L}) > 0$. Then the choice of $\alpha$ that minimizes $\rho\left( \bar{W} - \frac{1}{N} J \right)$ and hence maximizes
the convergence rate of the mean state vector is

$$\alpha^* = \frac{2}{\lambda_2(\bar{L}) + \lambda_N(\bar{L})} \tag{53}$$

The corresponding minimum $\rho(\cdot)$ is

$$\rho_{min}\left( \bar{W} - \frac{1}{N} J \right) = \frac{1 - \lambda_2(\bar{L}) / \lambda_N(\bar{L})}{1 + \lambda_2(\bar{L}) / \lambda_N(\bar{L})} \tag{54}$$

Proof: It follows by generalizing the result in [24] to non-binary matrices. ■

This section derived necessary and sufficient conditions for the convergence of the mean in terms
of $\lambda_2(\bar{L})$. Also, it provided the values of $\alpha$ that guarantee convergence when $\lambda_2(\bar{L}) > 0$. The next
Subsection considers mss convergence of average consensus.
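The optimal weight and the resulting mean convergence factor are simple functions of the two extreme nonzero eigenvalues of the mean Laplacian. The sketch below is our own illustration: the function name `optimal_alpha` and the example (a three-node path supergraph whose links all form with probability $p = 0.8$, so that $\bar{L}$ is $p$ times the path Laplacian with eigenvalues $0, p, 3p$) are assumptions made for the example. It evaluates the weight that minimizes the mean convergence factor and cross-checks the minimum against $\max(|1 - \alpha\lambda_2(\bar{L})|, |1 - \alpha\lambda_N(\bar{L})|)$ from Lemmas 6 and 8.

```python
def optimal_alpha(lambda_2, lambda_n):
    """Return (alpha_star, rho_min) given lambda_2 and lambda_N of the mean Laplacian."""
    if lambda_2 <= 0:
        raise ValueError("the mean graph must be connected (lambda_2 > 0)")
    alpha_star = 2.0 / (lambda_2 + lambda_n)
    ratio = lambda_2 / lambda_n
    rho_min = (1.0 - ratio) / (1.0 + ratio)
    return alpha_star, rho_min

# Assumed example: mean Laplacian equal to p times the path Laplacian (eigenvalues 0, p, 3p).
p = 0.8
alpha_star, rho_min = optimal_alpha(1.0 * p, 3.0 * p)

# Cross-check: the spectral radius is max(|1 - alpha * lambda_2|, |1 - alpha * lambda_N|).
rho_check = max(abs(1.0 - alpha_star * 1.0 * p), abs(1.0 - alpha_star * 3.0 * p))
```

At the optimum the two competing terms are equal in magnitude, which is the balancing condition that determines the optimal weight. The minimum factor depends only on the ratio of the two eigenvalues, so in this example it does not change with $p$.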

B. Mean Square Convergence 

This Subsection studies mean-square convergence, which implies convergence of the mean, but not the
reverse. We say that the algorithm converges in the mean-square sense (mss) iff

$$\forall\, x(0) \in \mathbb{R}^{N \times 1}: \quad \lim_{i \to \infty} E \left\| x(i) - x_{avg} \right\| = 0 \tag{55}$$

We need the following lemma first.

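Lemma 17 For any $x(0) \in \mathbb{R}^{N \times 1}$

$$\left\| x(i+1) - x_{avg} \right\| \le \left( \prod_{j=0}^{i} \rho\left( W(j) - \frac{1}{N} J \right) \right) \left\| x(0) - x_{avg} \right\| \tag{56}$$

Proof: We have

$$\begin{aligned}
\left\| x(i+1) - x_{avg} \right\| &= \left\| \left( \prod_{j=0}^{i} W(j) \right) x(0) - \frac{1}{N} J x(0) \right\| \\
&= \left\| W(i) \left( \prod_{j=0}^{i-1} W(j) x(0) \right) - \frac{1}{N} J \left( \prod_{j=0}^{i-1} W(j) x(0) \right) \right\|
\end{aligned} \tag{57}$$

where we have used the fact that

$$\frac{1}{N} J \left( \prod_{j=0}^{i-1} W(j) x(0) \right) = \frac{1}{N} J x(0)$$

From Lemma 7, it then follows

$$\left\| x(i+1) - x_{avg} \right\| \le \rho\left( W(i) - \frac{1}{N} J \right) \left\| \prod_{j=0}^{i-1} W(j) x(0) - \frac{1}{N} J \left( \prod_{j=0}^{i-1} W(j) x(0) \right) \right\| \tag{58}$$

Repeating the same argument for $j = 0$ to $i$ we finally get

$$\left\| x(i+1) - x_{avg} \right\| \le \left( \prod_{j=0}^{i} \rho\left( W(j) - \frac{1}{N} J \right) \right) \left\| x(0) - x_{avg} \right\| \tag{59}$$

This proves the Lemma. ■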
The following Theorem gives a sufficient condition for mss convergence.

Theorem 18 If $E\left[ \rho\left( W - \frac{1}{N} J \right) \right] < 1$, the state vector sequence $\{x(i)\}_{i=0}^{\infty}$ converges in the mss

$$\lim_{i \to \infty} E \left\| x(i) - x_{avg} \right\| = 0, \quad \forall\, x(0) \in \mathbb{R}^{N \times 1} \tag{60}$$

Proof: Taking expectation on both sides of eqn. (56) in Lemma 17 and using the iid of the $W(j)$'s

$$E \left\| x(i) - x_{avg} \right\| \le \left( E\left[ \rho\left( W - \frac{1}{N} J \right) \right] \right)^{i} \left\| x(0) - x_{avg} \right\| \tag{61}$$

where we dropped the index $i$ in $W(i)$. The Theorem then follows. ■

Like Definition 14 for the mean convergence factor, we introduce the mss convergence factor. First, note
that $E\left[ \rho\left( W - \frac{1}{N} J \right) \right]$ is a function of the weight $\alpha$ and the probability of edge formation matrix $P$ (or
$\bar{L}$ from (25)).

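Definition 19 (mss convergence factor, mss convergence rate) If $E\left[ \rho\left( W - \frac{1}{N} J \right) \right] < 1$, call $C(\alpha, \bar{L})$
and $S_g(\alpha, \bar{L})$ the mss convergence factor and the mss convergence gain per iteration (or the mss
convergence rate), respectively, where

$$C(\alpha, \bar{L}) = E\left[ \rho\left( W - \frac{1}{N} J \right) \right] \tag{62}$$

$$S_g(\alpha, \bar{L}) = -\ln C(\alpha, \bar{L}) \tag{63}$$

$$\phantom{S_g(\alpha, \bar{L})} = \ln\left( \frac{1}{E\left[ \rho\left( W - \frac{1}{N} J \right) \right]} \right) \tag{64}$$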

Corollary 20 mss convergence cannot be faster than convergence of the mean vector.
The Corollary follows from Theorem 18 and Lemma 11.
Theorem 18 shows that the smaller the mss convergence factor $C(\alpha, \bar{L}) = E\left[ \rho\left( W - \frac{1}{N} J \right) \right]$ is, the
faster the mss convergence. The actual value of $C(\alpha, \bar{L})$ depends both on the probability distribution
of the Laplacian $L$ and the constant weight $\alpha$. However, the probability distribution of $L$ must satisfy
certain conditions to guarantee that there are values of $\alpha$ that lead to mss convergence. Otherwise, no
choice of $\alpha$ will result in mss convergence. The next Theorem considers this issue. Before stating the
Theorem, let $d_{max}$ be the maximum degree of the graph with edge set $E = \mathcal{E}$ and define

$$\alpha_{mss} = \frac{1}{2 d_{max}} \tag{65}$$

Theorem 21 There is an $\alpha$ such that the consensus algorithm converges in mss iff $\lambda_2(\bar{L}) > 0$. In
other words, if $\lambda_2(\bar{L}) > 0$, we can find an $\alpha$, in particular, $\alpha = \alpha_{mss}$ defined in (65), that leads to
mss convergence. If $\lambda_2(\bar{L}) = 0$, no choice of $\alpha$ will result in mss convergence.

Proof: We first prove the sufficiency part. The proof is constructive, and we show that, if $\lambda_2(\bar{L}) > 0$,
we can find an $\alpha$ for which

$$C(\alpha, \bar{L}) = E\left[ \rho\left( W - \frac{1}{N} J \right) \right] < 1$$

Convergence then follows from Theorem 18.

Let $\lambda_2(\bar{L}) > 0$. By Lemma 3, $\bar{L}$ is irreducible. From irreducibility of $\bar{L}$, with non-zero probability,
we have graph realizations for which $L$ is irreducible and so $\lambda_2(L) > 0$. In particular, with non-zero
probability, we can have a realization for which the edge set $E = \mathcal{E}$; by assumption, this network
is irreducible and hence connected (because the corresponding Laplacian matrix has the same sparsity
pattern of $\bar{L}$ with non-zero entries of $\bar{L}$ replaced by ones). Hence, with non-zero probability, $\lambda_2(L) > 0$,
which makes $E[\lambda_2(L)] > 0$. Thus we have

$$\lambda_2(\bar{L}) > 0 \implies E[\lambda_2(L)] > 0 \tag{66}$$

Let $d_{max}(G)$ be the maximum vertex degree of graph $G$. Then, from spectral graph theory, see [23],

$$\lambda_N(L(G)) \le 2 d_{max}(G) \tag{67}$$

We now claim mss convergence for $\alpha = \alpha_{mss}$. From Lemma 8 and (33),

$$\begin{aligned}
\rho\left( W - \frac{1}{N} J \right) &= \max\left( \lambda_2(W), -\lambda_N(W) \right) \\
&= \max\left( 1 - \alpha_{mss} \lambda_2(L), \ \alpha_{mss} \lambda_N(L) - 1 \right) \\
&= 1 - \alpha_{mss} \lambda_2(L) \\
&\le 1
\end{aligned} \tag{68}$$

where the step before last follows from the fact that, from eqns. (67) and (65),

$$1 - \alpha_{mss} \lambda_2(L) \ge 0 \ge \alpha_{mss} \lambda_N(L) - 1 \tag{69}$$

Taking expectation on both sides of eqn. (68), and since $0 < E[\lambda_2(L)] \le 2 d_{max}$, we get

$$C(\alpha_{mss}, \bar{L}) = E\left[ \rho\left( W - \frac{1}{N} J \right) \right] = 1 - \alpha_{mss} E[\lambda_2(L)] < 1 \tag{70}$$

mss convergence then follows from Theorem 18. This proves the sufficiency part.

The necessary condition follows from the fact that, if $\lambda_2(\bar{L}) = 0$, Theorem 15 precludes convergence
of the mean vector. Since, by Corollary 20, convergence of the mean is necessary for mss convergence,
we conclude that, if $\lambda_2(\bar{L}) = 0$, no choice of $\alpha$ will result in mss convergence. ■

Theorem 21 gives necessary and sufficient conditions on the probability distribution of the Laplacian
$L$ for mean square convergence. This is significant as it relates mss convergence to the network topology.
Because this condition is in terms of the algebraic connectivity of the mean Laplacian associated with
the probability distribution of edge formation $P$, it is straightforward to check.
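To make the constructive step of the proof concrete, consider a small worked example of our own; it is an illustration under stated assumptions rather than a result from the text. Take the three-node path supergraph 1 - 2 - 3, so that $d_{max} = 2$, and let both links form independently with probability $p$. The Laplacian of the fully connected realization has eigenvalues 0, 1, 3, while every other realization is disconnected and has $\lambda_2(L) = 0$.

```latex
% Assumed example: path supergraph 1 - 2 - 3 (d_max = 2), each link online with probability p.
\begin{align*}
\alpha_{mss} &= \frac{1}{2 d_{max}} = \frac{1}{4}, \\
\rho\left( W - \tfrac{1}{N} J \right) &= 1 - \alpha_{mss} \lambda_2(L) =
  \begin{cases}
    3/4 & \text{both links online (probability } p^2\text{)}, \\
    1   & \text{otherwise},
  \end{cases} \\
C(\alpha_{mss}, \bar{L}) &= \tfrac{3}{4} p^2 + (1 - p^2) = 1 - \tfrac{1}{4} p^2 < 1
  \quad \text{for } p > 0.
\end{align*}
```

The mss convergence factor stays below one as soon as $p > 0$, which is exactly the condition under which the mean graph of this example is connected.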



C. Almost Sure Convergence 

We extend the results of the earlier sections and show that A 2 (L) > is also a necessary and sufficient 
condition for a.s. convergence of the sequence {x(i)}°^ . Before proceeding to a formal statement and 
proof of this, we recall some basic facts about the convergence of (scalar) random variables. 

Definition 22 (A.S. convergence of random variables) Let $\{\xi_i\}_{i=0}^{\infty}$ be a sequence of random variables
defined on some common probability space $(\Omega, \mathcal{F}, \mathbb{P})$. Then $\{\xi_i\}_{i=0}^{\infty}$ converges a.s. to another random
variable $\xi$ defined on $(\Omega, \mathcal{F}, \mathbb{P})$ ($\xi_i \to \xi$ a.s.) if

$$\mathbb{P}\left(\omega \in \Omega : \xi_i(\omega) \xrightarrow[i \to \infty]{} \xi(\omega)\right) = 1 \tag{71}$$

This definition readily extends to random vectors, where a.s. convergence means a.s. convergence of
each component (see [25], [26]).

We also recall that mss convergence of a sequence of random variables $\{x(i)\}_{i=0}^{\infty}$ implies convergence
in probability through Chebyshev's inequality. Also, we note that convergence in probability implies
a.s. convergence of a subsequence (see [27], [26]).

We now formalize the theorem for almost sure convergence of the state vector sequence $\{x(i)\}_{i=0}^{\infty}$.

Theorem 23 A necessary and sufficient condition for a.s. convergence of the sequence $\{x(i)\}_{i=0}^{\infty}$ is
$\lambda_2(\overline{L}) > 0$. In other words, if $\lambda_2(\overline{L}) > 0$, then there exists an $\alpha$ such that $x(i) \to x_{avg}$ a.s.
Conversely, if $\lambda_2(\overline{L}) = 0$, then no choice of $\alpha$ leads to a.s. convergence.

Proof: We prove the sufficiency part first. As in Theorem 21, we give a constructive proof. We claim
that the choice $\alpha = \alpha_{mss} = 1/(2 d_{\max})$ (see eqn. (65)) leads to a.s. convergence. To this end, define the
sequence of random variables

$$\xi_i = \|x(i) - x_{avg}\|^{1/2} \tag{72}$$

It follows from the properties of finite dimensional real number sequences (see [28]) that

$$x(i) \to x_{avg} \text{ a.s.} \iff \xi_i \to 0 \text{ a.s.} \tag{73}$$

From Theorem 21 we note that

$$\xi_i \xrightarrow{\text{mss}} 0 \tag{74}$$

Thus $\xi_i \to 0$ in probability and there exists a subsequence $\{\xi_{i_k}\}_{k=0}^{\infty}$ which converges to 0 a.s. Also, we
note from eqn. (67) that $0 < \alpha_{mss} < 1$. Then, from eqn. (68), it follows that

$$\rho\left(W - \tfrac{1}{N}J\right) \le 1 \tag{75}$$

Hence, from Lemma 7, we have

$$\xi_i \le \rho^{1/2}\left(W(i-1) - \tfrac{1}{N}J\right)\xi_{i-1} \le \xi_{i-1} \tag{76}$$

Thus $\{\xi_i\}_{i=0}^{\infty}$ is a non-increasing sequence of random variables, a subsequence of which converges a.s. to
0. By the properties of real valued sequences, $\xi_i \to 0$ a.s. The sufficiency part then follows from (72).

The necessary part is trivial, because $\lambda_2(\overline{L}) = 0$ implies that the network always separates into at least
two components with zero probability of communication between them. Hence no weight assignment
scheme can lead to a.s. convergence. ■
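
The monotonicity used in (76) can be observed numerically. The following is a minimal simulation sketch, not the authors' code: it runs the equal-weight iteration $x(i+1) = (I - \alpha L(i))\,x(i)$ with independent link failures and $\alpha = 1/(2 d_{\max})$, taking $d_{\max}$ as the maximum degree of the graph of realizable links. The ring network, the link probability 0.4, the initial state, and the function name are all illustrative assumptions.

```python
import random

def consensus_errors(edges, probs, x0, steps, seed=0):
    """Simulate x(i+1) = (I - alpha L(i)) x(i) under random link failures.

    edges: list of (n, l) pairs; probs: matching link-up probabilities.
    Returns the errors ||x(i) - x_avg|| (Euclidean norm) for i = 0..steps.
    """
    rng = random.Random(seed)
    n_nodes = len(x0)
    degree = [0] * n_nodes
    for n, l in edges:
        degree[n] += 1
        degree[l] += 1
    alpha = 1.0 / (2 * max(degree))  # alpha_mss = 1 / (2 d_max)
    avg = sum(x0) / n_nodes
    x = list(x0)
    errors = []
    for _ in range(steps + 1):
        errors.append(sum((v - avg) ** 2 for v in x) ** 0.5)
        new_x = list(x)
        for (n, l), p in zip(edges, probs):
            if rng.random() < p:  # link (n, l) is up at this iteration
                new_x[n] += alpha * (x[l] - x[n])
                new_x[l] += alpha * (x[n] - x[l])
        x = new_x
    return errors

# Hypothetical ring of 6 sensors, each link up 40% of the time.
ring = [(k, (k + 1) % 6) for k in range(6)]
errs = consensus_errors(ring, [0.4] * 6, [1.0, 5.0, -2.0, 0.5, 3.0, 7.0], 400)
print(errs[0], errs[-1])
```

With this choice of $\alpha$ every realization of $W(i)$ has eigenvalues in $[0, 1]$, so the error never increases from one step to the next, even in iterations where the realized graph is disconnected.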

A note on Theorems 21 and 23: We consider only equal weights, i.e., all the links are assigned
the same weight $\alpha$. However, it is interesting that, whatever the weights (in particular, even with different
weights for different links), a necessary condition for mss convergence (and a.s. convergence) is $\lambda_2(\overline{L}) > 0$.
This is because (as argued in Theorem 23) if $\lambda_2(\overline{L}) = 0$, the network separates into two components
with zero probability of communication between each other. Hence, no weight assignment can lead to
mss convergence. Thus, the necessary condition established in Theorems 21 and 23 for mss convergence
and a.s. convergence, respectively, in the constant link weight case also holds for more general weight
assignments. In other words, if we have a weight assignment (with possibly different weights for
different links) for which the consensus algorithm converges in mss (and a.s.), then we can always find
a constant weight $\alpha$ for which the consensus algorithm converges in mss (and a.s.).

V. MSS Convergence Rate 

We now study the mss convergence of the algorithm through the convergence metrics given in
Definition 19. In the sequel, whenever we refer to the convergence rate of the algorithm, we mean the mss
convergence gain per iteration, $S_g(\alpha, L)$, unless otherwise stated. We derive bounds on the mss
convergence rate of the algorithm. We assume that $\lambda_2(\overline{L}) > 0$. Hence, by Theorem 21, there exists $\alpha$, in
particular $\alpha_{mss}$, leading to mss convergence. However, given a particular distribution of the Laplacian
$L$, the actual choice of $\alpha$ plays a significant role in determining the convergence rate. Thus, given a
particular distribution of $L$, we must choose the value of $\alpha$ that maximizes the convergence speed. From
Theorem 18, we note that the smaller the mss-convergence factor $C(\alpha, L)$ given by (62) is, the faster the
convergence. For a given edge formation probability distribution $P$ (and hence $L$), the value of $C(\alpha, L)$
depends on $\alpha$. Thus, to maximize convergence speed for a given $P$, we perform the minimization

$$C^*(L) = \min_{\alpha} C(\alpha, L) = \min_{\alpha} E\left[\rho\left(W - \tfrac{1}{N}J\right)\right] \tag{77}$$

We present the results in terms of the best achievable mss convergence rate $S_g^*(L)$

$$S_g^*(L) = -\ln C^*(L) \tag{78}$$

The minimization in eqn. (77) is difficult, because it depends on the probability distribution of the Laplacian $L$.
But, by Lemma 10, $C(\alpha, L)$ is convex in $\alpha$ for a given $L$; so, its minimum is attainable using numerical
procedures. In performing this minimization, we do not need to consider the entire real line to find
the optimal $\alpha$. The following Lemma provides a range where the optimal $\alpha$ lies.



Lemma 24 Let $\lambda_2(\overline{L}) > 0$. Then

$$0 < \alpha^* < \frac{2}{\lambda_N(\overline{L})} \tag{79}$$

Proof: Since $\lambda_2(\overline{L}) > 0$, by Theorem 21, we can find $\alpha$ that leads to mss convergence. But a
necessary condition for mss convergence is convergence of the mean vector. From Section IV-A, the mean
converges only if

$$0 < \alpha < \frac{2}{\lambda_N(\overline{L})} \tag{80}$$

Hence, the optimal $\alpha^*$ leading to fastest mss convergence must also belong to this range.
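We can bound the optimal mss convergence rate $S_g^*(L)$.

Lemma 25 If $\lambda_2(\overline{L}) > 0$, then

$$S_g^*(L) \ge \ln\left(\frac{1}{1 - \alpha_{mss}E[\lambda_2(L)]}\right) \tag{81}$$

Proof: By Theorem 21, if $\lambda_2(\overline{L}) > 0$, then $\alpha = \alpha_{mss}$ leads to mss convergence and

$$\begin{aligned}
C(\alpha_{mss}, L) &= E\left[\rho\left(W - \tfrac{1}{N}J\right)\right] \\
&= 1 - \alpha_{mss}E[\lambda_2(L)] && (82) \\
&\ge C^*(L) && (83)
\end{aligned}$$

The Lemma then follows because

$$\begin{aligned}
S_g^*(L) &= \ln\left(\frac{1}{C^*(L)}\right) && (84) \\
&\ge \ln\left(\frac{1}{C(\alpha_{mss}, L)}\right) && (85) \\
&= \ln\left(\frac{1}{1 - \alpha_{mss}E[\lambda_2(L)]}\right) && (86)
\end{aligned}$$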
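
As a small worked illustration of Lemmas 24 and 25 (our own example, not taken from the paper), consider $N = 2$ sensors joined by a single link that is up with probability $p$, $0 < p < 1$, and assume the conventions $W = I - \alpha L$ and $J = \mathbf{1}\mathbf{1}^T$. Both quantities can then be computed in closed form:

```latex
% Link up (probability p):     L = [1 -1; -1 1], \lambda_2(L) = 2, \rho(W - J/2) = |1 - 2\alpha|
% Link down (probability 1-p): L = 0,            \lambda_2(L) = 0, \rho(W - J/2) = 1
\begin{align*}
C(\alpha, L) &= p\,|1 - 2\alpha| + (1 - p), \\
\alpha^* &= \tfrac{1}{2}, \qquad C^*(L) = 1 - p, \qquad S_g^*(L) = \ln\frac{1}{1 - p}, \\
\lambda_N(\overline{L}) &= 2p \;\Longrightarrow\;
  0 < \alpha^* = \tfrac{1}{2} < \frac{2}{\lambda_N(\overline{L})} = \frac{1}{p}, \\
\alpha_{mss} &= \frac{1}{2 d_{\max}} = \tfrac{1}{2}, \quad E[\lambda_2(L)] = 2p
  \;\Longrightarrow\; \ln\frac{1}{1 - \alpha_{mss}E[\lambda_2(L)]} = \ln\frac{1}{1 - p}.
\end{align*}
```

In this two-sensor case the optimal weight lies inside the range (79) and the lower bound (81) is attained with equality; for larger networks the bound is generally not tight.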



VI. Consensus With Communication Constraints: Topology Optimization 

In the previous sections, we analyzed the impact of the probability distribution D of the network 
topology on the mss convergence rate of the distributed average consensus algorithm. This section studies 
the problem of sensor network topology optimization for fast consensus in the presence of inter-sensor 
communication (or infrastructure) cost constraints. We assume equal link weights throughout. 

We consider $N$ sensors and a symmetric cost matrix $C$, where the entry $C_{nl}$ is the cost (communication
or infrastructure) incurred per iteration when sensors $n$ and $l$ communicate. The goal is to design the
connectivity graph that leads to the fastest convergence rate under a constraint on the total communication
cost per iteration. Depending on the structure of the cost matrix $C$ and the network topology (deterministic
or randomized), this optimization problem may have the following variants:

1) Fixed topology with equal costs: Here the entries $C_{nl}$ of the cost matrix $C$ are all equal and we
look for the optimal fixed or deterministic topology leading to fastest convergence of the consensus
algorithm. It is easy to see that the equal cost assumption translates into a constraint on the number
of network links and the optimal solution is essentially the class of non-bipartite Ramanujan graphs
(see [9], [10], [11]).

2) Fixed topology with different costs (FCCC): In this case the inter-sensor costs $C_{nl}$ may be different,
and we seek the optimal fixed or deterministic topology leading to fastest convergence. This is a
difficult combinatorial optimization problem and there is no closed form solution in general.

3) Random topology with different costs (RCCC): This is the most general problem, where the costs
$C_{nl}$ may be different and we look for the optimal (random or deterministic) topology leading to
the fastest convergence rate under a communication cost constraint. Because the network is random,
it makes sense to constrain the (network) average (expected) communication cost per iteration.
Likewise, convergence should also be interpreted in a probabilistic sense, for example, as mean
square convergence. To summarize, in the RCCC problem, we are concerned with: (i) designing the
optimal probability of edge formation matrix $P$, (ii) under an average communication cost constraint,
(iii) leading to the fastest mss convergence rate. RCCC reduces to FCCC if the entries of the optimal
$P$ are 0 or 1. In this sense, the RCCC problem relaxes the difficult combinatorial FCCC problem
and, as we will see later, will usually lead to better overall solutions, especially under medium to
low communication cost constraints. This is because with a fixed topology, we are forced to use
the same network always, while in the random topology case we can occasionally make use of
very good networks, still satisfying the cost constraint. We can draw an analogy between RCCC
and gossip algorithms (see [15]). However, the context and assumptions of the two problems are
different. Reference [15] optimizes the gossip probabilities for a given network topology under the
gossip protocol (only two nodes, randomly selected with gossip probability, can communicate at
each iteration), and [15] does not impose a communication cost constraint. In contrast, we design
the optimal (equal) weight $\alpha$ and the optimal $P$ matrix leading to the fastest mss convergence rate,
under an average cost constraint. The topology solution that we determine gives the percentage of
time a link is to be used, or, as another interpretation, the probability of error associated with
reliable communication in a given link. Because the signal-to-noise ratio (SNR) often determines the
probability of error, enforcing the topology, i.e., $P$, is like selecting the SNR for each link.

A. Random Topology with Communication Cost Constraints (RCCC)

We are given $N$ sensors. We model the cost of communication by an $N \times N$ matrix $C = C^T$. The
entry $C_{nl} > 0$, $n \ne l$, is the cost incurred by a single communication between nodes $n$ and $l$. An entry
$C_{nl} = +\infty$ precludes sensors $n$ and $l$ from communicating. Let $P$ be the probability of edge formation
matrix. The diagonal entries of $P$ are zero, although each node can access its own data at zero cost. The $P$
matrix induces a probability distribution on the Laplacian $L(i)$, which at time $i$ is a random instantiation
based on the $P$ matrix. The total cost incurred at stage $i$ is

$$u_i = -\frac{1}{2}\sum_{n \ne l} L_{nl}(i)\,C_{nl} = -\frac{1}{2}\operatorname{Tr}\left(C L(i)\right) \tag{87}$$

This follows from $C$ being symmetric with zero diagonal entries. Since $L(i)$ is a random matrix, the
cost $u_i$ incurred at step $i$ is random. From (87), the expected cost incurred at step $i$ is

$$\forall i: \quad E[u_i] = -\frac{1}{2}\operatorname{Tr}\left(C\overline{L}\right) \tag{88}$$

We consider the distributed averaging consensus model with equal link weights given in eqns. (12)
and (18). From Section IV-B, mss convergence is determined by the convergence factor $C(\alpha, L) =
E\left[\rho\left(W - \tfrac{1}{N}J\right)\right]$ or the convergence rate $S_g(\alpha, L)$ defined in (63). In particular, the smaller $C(\alpha, L)$
(or the larger $S_g(\alpha, L)$) is, the faster the convergence. The expected cost per iteration step in eqn. (88)
depends on $\overline{L}$ and hence on $P$, which are in one-to-one correspondence.
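
The trace expression for the expected cost in eqn. (88) can be checked against a direct sum over links. The sketch below is illustrative only: it assumes all costs are finite (links with infinite cost would have to be excluded beforehand), and the function names and the 3-sensor matrices are hypothetical.

```python
def mean_laplacian(P):
    """Mean Laplacian: off-diagonal entries -P[n][l], rows summing to zero."""
    N = len(P)
    L = [[-P[n][l] if n != l else 0.0 for l in range(N)] for n in range(N)]
    for n in range(N):
        L[n][n] = -sum(L[n])  # diagonal = sum of the link probabilities at n
    return L

def expected_cost(C, P):
    """Expected cost per iteration, -1/2 Tr(C Lbar), as in eqn. (88)."""
    Lbar = mean_laplacian(P)
    N = len(P)
    trace = sum(C[n][l] * Lbar[l][n] for n in range(N) for l in range(N))
    return -0.5 * trace

# Hypothetical symmetric costs (zero diagonal) and link probabilities.
C = [[0.0, 4.0, 9.0],
     [4.0, 0.0, 1.0],
     [9.0, 1.0, 0.0]]
P = [[0.0, 0.5, 0.1],
     [0.5, 0.0, 1.0],
     [0.1, 1.0, 0.0]]

cost = expected_cost(C, P)
# Same quantity as a sum of probability times cost over each link, counted once.
direct = sum(P[n][l] * C[n][l] for n in range(3) for l in range(n + 1, 3))
print(cost, direct)
```

Both expressions give the same value because $C$ has a zero diagonal, so only the off-diagonal entries $-P_{nl}$ of the mean Laplacian contribute to the trace.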

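Let $\mathcal{D}(U)$ be the set of feasible $\overline{L}$ (and hence $P$) given a constraint $U$ on the expected cost per step

$$\mathcal{D}(U) = \left\{\overline{L} : -\frac{1}{2}\operatorname{Tr}\left(C\overline{L}\right) \le U\right\} \tag{89}$$

The RCCC problem can then be stated formally as:

RCCC: Problem formulation.

$$\begin{aligned}
\max_{\alpha,\,\overline{L}} \quad & S_g(\alpha, L) \\
\text{subject to} \quad & \overline{L} = \overline{L}^T \in \mathbb{R}^{N \times N} \\
& -1 \le \overline{L}_{nl} \le 0, \quad n, l \in \{1, \ldots, N\},\ n \ne l \\
& \overline{L}\mathbf{1} = \mathbf{0} \\
& -\frac{1}{2}\operatorname{Tr}\left(C\overline{L}\right) \le U
\end{aligned} \tag{90}$$

The second inequality constraint comes from the fact that $\overline{L}_{nl} = -P_{nl}$, $n \ne l$. The other constraints
follow from the properties of the Laplacian and the cost constraint.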

B. Alternate Randomized Consensus under Communication Cost Constraints (ARCCC) 

The RCCC problem in (90) is very difficult to solve. We formulate an alternate randomized consensus
under communication cost constraints (ARCCC) problem. We show successively: (i) ARCCC is convex
and can be solved by fast numerical optimization procedures; (ii) ARCCC is a good approximation
to (90); and (iii) ARCCC leads to topologies with good convergence rates. Point (i) is in this section,
while points (ii) and (iii) are studied in Section VII, where we analyze the performance of ARCCC.

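ARCCC: Problem Formulation.

$$\begin{aligned}
\max_{\overline{L}} \quad & \lambda_2(\overline{L}) \\
\text{subject to} \quad & \overline{L} = \overline{L}^T \in \mathbb{R}^{N \times N} \\
& -1 \le \overline{L}_{nl} \le 0, \quad n, l \in \{1, \ldots, N\},\ n \ne l \\
& \overline{L}\mathbf{1} = \mathbf{0} \\
& -\frac{1}{2}\operatorname{Tr}\left(C\overline{L}\right) \le U
\end{aligned} \tag{91}$$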

Lemma 26 The optimization problem ARCCC in (91) is convex.

Proof: From Lemma 4, it follows that the objective $\lambda_2(\overline{L})$ is a concave function of $\overline{L}$. Also, the
set of $\overline{L}$ satisfying the constraints forms a convex set. Hence, ARCCC maximizes a concave function
over a convex set; so, it is convex. ■

The optimization problem in Lemma 26 is a semidefinite programming (SDP) problem that can be
solved numerically in efficient ways; see references [29], [30] for SDP solving methods (see also [31],
[32] for constrained optimization of graph Laplacian eigenvalues).
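
One standard way to make the SDP structure explicit (a sketch of a common reformulation, not necessarily the form used by the authors or by [29]-[32]) is to introduce a scalar $s$ and use the fact that, for a symmetric matrix with $\overline{L}\mathbf{1} = \mathbf{0}$, the condition $\lambda_2(\overline{L}) \ge s$ is equivalent to a linear matrix inequality. The variable $s$ is our own notation, and pairs with $C_{nl} = +\infty$ are assumed to be handled by fixing $\overline{L}_{nl} = 0$ and dropping them from the trace.

```latex
\begin{align*}
\max_{s,\;\overline{L}} \quad & s \\
\text{subject to} \quad
  & \overline{L} - s\left(I - \tfrac{1}{N}\mathbf{1}\mathbf{1}^T\right) \succeq 0, \\
  & \overline{L} = \overline{L}^T, \qquad \overline{L}\mathbf{1} = \mathbf{0}, \\
  & -1 \le \overline{L}_{nl} \le 0, \quad n \ne l, \\
  & -\tfrac{1}{2}\operatorname{Tr}\left(C\overline{L}\right) \le U.
\end{align*}
```

The matrix inequality holds trivially along the direction $\mathbf{1}$ and requires every eigenvalue of $\overline{L}$ on the orthogonal complement of $\mathbf{1}$ to be at least $s$, so the optimal $s$ equals the optimal $\lambda_2(\overline{L})$ in (91). All constraints are linear in $(s, \overline{L})$, which is the form accepted by general-purpose SDP solvers.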

VII. Topology Optimization: Performance Results 

In this section, Subsection VII-A discusses in what sense the ARCCC topology optimization problem
introduced in Section VI-B and eqn. (91) is a good approximation to the original RCCC topology
optimization formulation of Section VI-A and eqn. (90). Subsection VII-B establishes bounds on the
optimal value as a function of the communication constraint. Finally, Subsection VII-C illustrates by a
numerical study that the ARCCC optimization obtains topologies for which the distributed consensus
exhibits fast convergence.



A. ARCCC as a Good Approximation to RCCC

The difficulty with RCCC stems from the fact that it involves joint optimization over both $\alpha$ and $\overline{L}$.
For a given $\overline{L}$, there is, in general, no closed form solution of

$$S_g^*(L) = \max_{\alpha} S_g(\alpha, L) \tag{92}$$

We first present a plausible argument of why ARCCC is a good surrogate for RCCC, and then present
numerical results that justify this argument.

The argument has two steps. First, we replace in RCCC the maximization of $S_g^*(L)$
by the maximization of $E[\lambda_2(L)]$. We justify this step by noting that eqn. (86) bounds $S_g^*(L)$ from below
and that this lower bound increases with $E[\lambda_2(L)]$, so larger values of $E[\lambda_2(L)]$ tend to give higher $S_g^*(L)$.
This suggests that, for a given set of distributions $\overline{L} \in \mathcal{D}(U)$, the quantity $E[\lambda_2(L)]$ may provide an
ordering on the elements of $\mathcal{D}(U)$ with respect to the mss convergence rate $S_g^*(L)$. Hence, a topology with
fast convergence rate satisfying the communication constraint $U$ is provided by the distribution
$\overline{L}^* \in \mathcal{D}(U)$ that maximizes the quantity $E[\lambda_2(L)]$ over the set $\mathcal{D}(U)$.

This is not enough to get a reasonable topology optimization problem, since evaluating $E[\lambda_2(L)]$
requires costly Monte Carlo simulations (see [13]). The second step replaces
the optimization of $E[\lambda_2(L)]$ by the maximization of $\lambda_2(\overline{L})$, which simply involves computing the second
eigenvalue of the mean Laplacian $\overline{L}$, with no Monte Carlo simulations involved. This step is justified on the
basis of Lemma 5, which upper-bounds $E[\lambda_2(L)]$ by $\lambda_2(\overline{L})$. This suggests that for $E[\lambda_2(L)]$ to be large,
$\lambda_2(\overline{L})$ should be large.

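Putting together the two steps, the RCCC problem in eqn. (90) is successively approximated by

$$\begin{aligned}
S_g^* &= \max_{\alpha,\ \overline{L} \in \mathcal{D}(U)} S_g(\alpha, L) \\
&\approx \max_{\alpha} S_g(\alpha, L^*) \\
&= \widehat{S}_g^*
\end{aligned} \tag{93}$$

where $\overline{L}^*$ is given by

$$\overline{L}^* = \arg\max_{\overline{L} \in \mathcal{D}(U)} \lambda_2(\overline{L}) \tag{94}$$

In general, $\widehat{S}_g^* \le S_g^*$. If $S_g(\alpha, L)$ were a non-decreasing function of $\lambda_2(\overline{L})$, we would have $\widehat{S}_g^* = S_g^*$.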

We verify by a numerical study how and in what sense $S_g^*(L)$ in (92) increases with $E[\lambda_2(L)]$ and
$\lambda_2(\overline{L})$. In our simulation, we choose a network with $N = 500$ sensors and let the average degree $d_{avg}$
of the network vary in steps of 5 from 10 to 40. For each of these 7 values of $d_{avg}$, we construct 200
Erdős-Rényi random graphs by choosing at random $M = d_{avg}N/2$ edges of the $N(N-1)/2$ possible
pairings of vertices in the network. For each of these 200 random graphs, we generate randomly a
probability of formation matrix $P$ (hence a probability distribution of $L$) by choosing for each edge a
weight between 0 and 1 from a uniform random distribution. For each such $P$ matrix, we collect statistics
on the convergence rate $S_g^*(L)$ and $E[\lambda_2(L)]$ by generating 400 possible $L(i)$. For each $P$, we also obtain
the corresponding $\lambda_2(\overline{L})$ by eqn. (25). This is an extensive and computationally expensive simulation.
Fig. 1 displays the results by plotting the convergence rate $S_g^*(L)$ with respect to $E[\lambda_2(L)]$ (left plot)
and with respect to $\lambda_2(\overline{L})$ (right plot). These two plots are remarkably similar and both show that, except
for local oscillations, the trend of the convergence rate $S_g^*(L)$ is to increase with increasing $E[\lambda_2(L)]$
and $\lambda_2(\overline{L})$. Of course, $\lambda_2(\overline{L})$ is much easier to evaluate than $E[\lambda_2(L)]$. The plots in Fig. 1 confirm
that, given a class $\mathcal{D}(U)$ of probability distributions of $L$, we can set an ordering in $\mathcal{D}(U)$ by evaluating
the corresponding $\lambda_2(\overline{L})$'s, in the sense that a larger value of $\lambda_2(\overline{L})$ leads to a better convergence rate
in general (see also [13], where part of these results were presented). This study shows that optimal
topologies with respect to ARCCC should be good topologies with respect to RCCC.
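
The graph and probability-matrix generation step of this study can be sketched as follows. This is an illustrative reconstruction of the procedure described above, not the authors' simulation code: the function name and seed handling are assumptions, and a smaller network ($N = 50$) is used here only to keep the example fast.

```python
import random
from itertools import combinations

def random_formation_matrix(N, d_avg, seed=0):
    """Pick M = d_avg * N / 2 edges at random, then give each a uniform weight.

    Returns (edges, P), with P a symmetric N x N list of lists whose entry
    P[n][l] is the formation probability of link (n, l); the diagonal is zero.
    """
    rng = random.Random(seed)
    M = d_avg * N // 2
    all_pairs = list(combinations(range(N), 2))  # the N(N-1)/2 possible pairings
    edges = rng.sample(all_pairs, M)
    P = [[0.0] * N for _ in range(N)]
    for n, l in edges:
        P[n][l] = P[l][n] = rng.random()  # uniform weight in [0, 1)
    return edges, P

edges, P = random_formation_matrix(N=50, d_avg=10)
print(len(edges))
```

Each such $P$ would then be used both to compute $\lambda_2(\overline{L})$ directly and to draw the random Laplacians $L(i)$ from which $E[\lambda_2(L)]$ and $S_g^*(L)$ are estimated.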




B. ARCCC: Performance Analysis 

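To gain insight into ARCCC, we study the dependence of the maximum value of its functional

$$\phi(U) = \max_{\overline{L} \in \mathcal{D}(U)} \lambda_2(\overline{L}) \tag{95}$$

on the value of the communication cost constraint $U$. We first establish the concavity of $\phi(U)$.

Lemma 27 Given a cost matrix $C$, $\phi(U)$ is a concave function of $U$.

Proof: Let $0 \le U_1 \le U_2$ and $0 \le t \le 1$. Consider the matrices $\overline{L}_1^*$ and $\overline{L}_2^*$, such that

$$\lambda_2(\overline{L}_1^*) = \phi(U_1) \quad \text{and} \quad \lambda_2(\overline{L}_2^*) = \phi(U_2)$$

It follows that

$$\overline{L}_1^* \in \mathcal{D}(U_1) \quad \text{and} \quad \overline{L}_2^* \in \mathcal{D}(U_2)$$

Let $\overline{L} = t\overline{L}_1^* + (1 - t)\overline{L}_2^*$. Then,

$$\begin{aligned}
-\frac{1}{2}\operatorname{Tr}\left\{C\overline{L}\right\} &= t\left(-\frac{1}{2}\operatorname{Tr}\left\{C\overline{L}_1^*\right\}\right) + (1 - t)\left(-\frac{1}{2}\operatorname{Tr}\left\{C\overline{L}_2^*\right\}\right) \\
&\le tU_1 + (1 - t)U_2
\end{aligned} \tag{96}$$

Hence $\overline{L} \in \mathcal{D}(tU_1 + (1 - t)U_2)$. From this we conclude that

$$\phi(tU_1 + (1 - t)U_2) \ge \lambda_2(\overline{L}) \tag{97}$$

Now, since $\lambda_2(\overline{L})$ is a concave function of $\overline{L}$ (see Lemma 4), we get

$$\begin{aligned}
\lambda_2(\overline{L}) &= \lambda_2\left(t\overline{L}_1^* + (1 - t)\overline{L}_2^*\right) \\
&\ge t\lambda_2(\overline{L}_1^*) + (1 - t)\lambda_2(\overline{L}_2^*) \\
&= t\phi(U_1) + (1 - t)\phi(U_2)
\end{aligned} \tag{98}$$

Finally, using eqns. (97) and (98), we get

$$\phi(tU_1 + (1 - t)U_2) \ge t\phi(U_1) + (1 - t)\phi(U_2) \tag{99}$$

which establishes the concavity of $\phi(U)$. ■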
We use the concavity of $\phi(U)$ to derive bounds on $\phi(U)$. Recall that $\mathcal{M}$ is the edge set of the
complete graph, i.e., the set of all possible $N(N-1)/2$ edges. Define the set of realizable edges $\mathcal{E} \subseteq \mathcal{M}$ by

$$\mathcal{E} = \left\{(n, l) \in \mathcal{M} : C_{nl} < \infty\right\} \tag{100}$$

and by $L_{\mathcal{E}}$ the associated Laplacian. Also, let the total cost $C_{tot}$ be

$$C_{tot} = \sum_{(n,l) \in \mathcal{E}} C_{nl} \tag{101}$$

The quantity $C_{tot}$ is the communication cost per iteration when all the realizable links are used.

Lemma 28 Let $C$ be a cost matrix and $U \ge C_{tot}$. Then $\phi(U) = \lambda_2(L_{\mathcal{E}})$. If $\mathcal{E} = \mathcal{M}$, then $\phi(U) = N$.

Proof: The best possible case is when all the network links $(n, l) \in \mathcal{E}$ have probability of formation
$P_{nl} = 1$ (the links in $\mathcal{E}^c$ must have zero probability of formation to satisfy the cost constraint). In this
case, $\overline{L} = L_{\mathcal{E}}$. Now, if $U \ge C_{tot}$, then $L_{\mathcal{E}} \in \mathcal{D}(U)$ and hence the proof follows. The case $\mathcal{E} = \mathcal{M}$ follows
from the fact that, for a complete graph, $\lambda_2(L_{\mathcal{M}}) = N$ (see [19], [20]). ■

Using the concavity of $\phi(U)$ (Lemma 27), we now derive a performance bound when $U \le C_{tot}$.

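Lemma 29 Let $C$ be a cost matrix. Then

$$\phi(U) \ge \left(\frac{U}{C_{tot}}\right)\lambda_2(L_{\mathcal{E}}), \quad 0 \le U \le C_{tot} \tag{102}$$

If $\mathcal{E} = \mathcal{M}$, then

$$\phi(U) \ge \left(\frac{U}{C_{tot}}\right)N, \quad 0 \le U \le C_{tot} \tag{103}$$

Proof: From Lemma 28, $\phi(C_{tot}) = \lambda_2(L_{\mathcal{E}})$. Then, using the concavity of $\phi(U)$ (see Lemma 27) and
the fact that $\phi(0) = 0$, we have, for $0 \le U \le C_{tot}$,

$$\begin{aligned}
\phi(U) &= \phi\left(\left(\frac{U}{C_{tot}}\right)C_{tot} + \left(1 - \frac{U}{C_{tot}}\right) \cdot 0\right) \\
&\ge \left(\frac{U}{C_{tot}}\right)\phi(C_{tot}) + \left(1 - \frac{U}{C_{tot}}\right)\phi(0) \\
&= \left(\frac{U}{C_{tot}}\right)\lambda_2(L_{\mathcal{E}})
\end{aligned} \tag{104}$$

This proves the Lemma. The case $\mathcal{E} = \mathcal{M}$ follows easily. ■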
Lemma 28 states what should be expected, namely: to achieve the optimal performance $\lambda_2(L_{\mathcal{E}})$ one
needs no more than $C_{tot}$. Lemma 29 is interesting since it states that the ARCCC optimal topology may
achieve better performance than the fraction of communication cost it uses would lead us to expect. The
numerical study in the next section helps to quantify these qualitative assessments.
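
The bound (103) also has a simple constructive reading, different from the concavity argument in the proof: when every pair of sensors can communicate, assigning the same probability $U/C_{tot}$ to every link uses exactly the budget $U$ and gives a mean Laplacian with $\lambda_2 = (U/C_{tot})N$, so the optimum can only be better. The sketch below checks this on a hypothetical 4-sensor cost matrix; the function name and the numbers are illustrative assumptions.

```python
def scaled_complete_design(C, U):
    """Uniform design P[n][l] = U / C_tot on a complete graph with finite costs.

    Returns (Lbar, expected_cost, c_tot); requires 0 <= U <= c_tot.
    """
    N = len(C)
    c_tot = sum(C[n][l] for n in range(N) for l in range(n + 1, N))
    t = U / c_tot  # common link formation probability
    Lbar = [[t * (N - 1) if n == l else -t for l in range(N)] for n in range(N)]
    cost = sum(t * C[n][l] for n in range(N) for l in range(n + 1, N))
    return Lbar, cost, c_tot

# Hypothetical symmetric cost matrix for N = 4 sensors.
C = [[0, 1, 4, 9],
     [1, 0, 1, 4],
     [4, 1, 0, 1],
     [9, 4, 1, 0]]
U = 5.0
Lbar, cost, c_tot = scaled_complete_design(C, U)

# Any zero-sum vector is an eigenvector of Lbar with eigenvalue (U / c_tot) * N.
x = [1.0, -2.0, 0.5, 0.5]
Lx = [sum(Lbar[n][l] * x[l] for l in range(4)) for n in range(4)]
lam = (U / c_tot) * 4
print(cost, lam, Lx)
```

The uniform design ignores the differences between link costs, which is why the ARCCC optimum, which concentrates probability on cheap and useful links, can exceed this bound.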

C. Numerical Studies: ARCCC 

This section solves the ARCCC semidefinite programming optimization given by (91). It solves for $P$,
which assigns to each realizable link its probability of error (i.e., its SNR), or the fraction of time it is
expected to be active. We compare the ARCCC optimal topology to a fixed radius connectivity (FRC)
topology detailed below. The sensor network is displayed on the left of Fig. 2. We deploy $N = 80$
sensors uniformly on a $25 \times 25$ square grid on the plane. The set $\mathcal{E}$ of realizable links is constructed
by choosing $|\mathcal{E}| = 9N$ edges randomly from the set $\mathcal{M}$ of all possible edges. We assume a geometric
propagation model: the communication cost is proportional to the square of the Euclidean distance $d_{nl}$
between sensors $n$ and $l$

$$C_{nl} = \begin{cases} \eta\, d_{nl}^2 & \text{if } (n, l) \in \mathcal{E} \\ \infty & \text{otherwise} \end{cases} \tag{105}$$

where $\eta$ is an appropriately chosen constant. With the FRC network, a sensor $n$ communicates with all
other sensors $l$ ($C_{nl} < \infty$) that lie within a radius $R$. The FRC topology is an instantiation of a fixed,
i.e., not random, topology with a fixed cost incurred per iteration.

Fig. 2 on the right plots, as a function of the cost constraint $U$, the per step convergence gain $S_g = \widehat{S}_g^*$
for the ARCCC optimal topology (top blue line) and the per step convergence gain $S_g$ of the FRC
topology (bottom red line). The ARCCC optimal topology converges much faster than the FRC topology,
with the improvement being more significant at medium to lower values of $U$.

The ARCCC topology has a markedly nonlinear behavior, with two asymptotes: a steeply
increasing asymptote for small $U$, and a horizontal asymptote for large $U$ (when all the realizable edges in $\mathcal{E}$ are
used). The two meet at the knee of the curve ($U = 6.9 \times 10^4$, $S_g = 0.505$). For $U = 6.9 \times 10^4$, the ARCCC
convergence rate is $S_g = 0.505$, while FRC's is $S_g = 0.152$, showing that ARCCC's topology is 3.3 times
faster than FRC's. For this example, we compute $C_{tot} = 14.7 \times 10^4$, which shows that ARCCC's optimal
topology achieves the asymptotic performance while using less than 50% of the communication cost.
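
The two headline figures in this paragraph follow directly from the reported values; the short check below simply redoes the arithmetic (the variable names are ours).

```python
# Values reported in the text for U = 6.9e4.
s_arccc, s_frc = 0.505, 0.152   # per step convergence gains
u_knee, c_tot = 6.9e4, 14.7e4   # cost at the knee and total cost C_tot

speedup = s_arccc / s_frc        # ratio of convergence gains
cost_fraction = u_knee / c_tot   # share of C_tot used at the knee

print(round(speedup, 1), round(cost_fraction, 2))
```

The ratio of gains is about 3.3 and the knee uses roughly 47% of $C_{tot}$, consistent with the "less than 50%" statement.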

Fig. 2. Left: Sensor placement of $N = 80$ sensors on a $25 \times 25$ square grid ($\eta = 1$). Right: Convergence gain $S_g$ vs. communication
cost $U$: ARCCC optimal topology, top (red) line; FRC topology, bottom (blue) line.

VIII. Conclusions

The paper formulates the design of the topology of a sensor network to maximize the convergence rate of
the consensus algorithm as a convex optimization problem. We consider that the communication channels
among sensors may fail at random times, that communication among sensors incurs a cost, and that there
is an overall communication cost constraint in the network. We first establish necessary and sufficient
conditions for mss convergence and a.s. convergence in terms of the expected value of the algebraic
connectivity of the random graph defining the network topology and in terms of the algebraic connectivity
of the average topology. We apply semidefinite programming to solve numerically for the optimal topology
design of the random network subject to the communication cost constraint. Because the topology is
random, the solution to this optimization specifies for each realizable link its probability of error (i.e.,
its SNR), or the fraction of time the link is expected to be active. We show by a simulation study that
the resulting topology design can improve by about 300% the convergence speed of average consensus
over more common designs, e.g., geometric topologies where sensors communicate with sensors within
a fixed distance. Our study also shows that the optimal random topology can achieve the convergence
speed of a non-random network at a fraction of the cost.